An exact nonparametric method for inferring mosaic structure in sequence triplets.
Abstract
Statistical tests for detecting mosaic structure or recombination among nucleotide sequences usually rely on identifying a pattern or a signal that would be unlikely to appear under clonal reproduction. Dozens of such tests have been described, but many are hampered by long running times, confounding of selection and recombination, and/or inability to isolate the mosaic-producing event. We introduce a test that is exact, nonparametric, rapidly computable, free of the infinite-sites assumption, able to distinguish between recombination and variation in mutation/fixation rates, and able to identify the breakpoints and sequences involved in the mosaic-producing event. Our test considers three sequences at a time: two parent sequences that may have recombined, with one or two breakpoints, to form the third sequence (the child sequence). Excess similarity of the child sequence to a candidate recombinant of the parents is a sign of recombination; we take the maximum value of this excess similarity as our test statistic Delta(m,n,b). We present a method for rapidly calculating the distribution of Delta(m,n,b) and demonstrate that its power is comparable to that of previous methods while its running time is much improved, especially when detecting recombination in large data sets.
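The triplet idea in the abstract can be illustrated with a small Python sketch. The abstract does not give the formal definition of the statistic, so the code below rests on labeled assumptions: only sites where the two parents differ and the child matches exactly one of them are treated as informative, each such site is a +1 step (child matches parent P) or a -1 step (child matches parent Q), and the "maximum excess similarity" is taken to be the maximum descent of the resulting walk. The function names are hypothetical, and the p-value is obtained by brute-force enumeration of all orderings of the steps, which is a stand-in for, not a reproduction of, the rapid calculation the authors describe.

```python
from itertools import combinations
from math import comb


def informative_steps(parent_p, parent_q, child):
    """Return +1/-1 steps for sites where the child matches exactly one parent.

    Assumption: sites where the parents agree, or where the child matches
    neither parent, carry no information and are skipped.
    """
    steps = []
    for p, q, c in zip(parent_p, parent_q, child):
        if p == q:
            continue
        if c == p:
            steps.append(+1)
        elif c == q:
            steps.append(-1)
    return steps


def max_descent(steps):
    """Largest drop from any earlier peak of the walk to a later point."""
    height = peak = best = 0
    for s in steps:
        height += s
        peak = max(peak, height)
        best = max(best, peak - height)
    return best


def exact_p_value(steps):
    """Exact p-value by enumerating every ordering of the up and down steps.

    Assumption: under the null, all orderings of the m up-steps and n
    down-steps are equally likely. Feasible only for small m + n.
    """
    m = steps.count(+1)
    n = steps.count(-1)
    observed = max_descent(steps)
    total = m + n
    hits = 0
    for down_positions in combinations(range(total), n):
        downs = set(down_positions)
        walk = [-1 if i in downs else +1 for i in range(total)]
        if max_descent(walk) >= observed:
            hits += 1
    return hits / comb(total, n)


if __name__ == "__main__":
    steps = informative_steps("AAAAAAAAAA", "CCCCCCCCCC", "AAACCCCAAA")
    print(max_descent(steps), exact_p_value(steps))
```

In the toy triplet, the child copies parent P, then Q for four consecutive sites, then P again, so the walk has a single long descent. Enumeration is exponential in the number of informative sites, so this version is useful only for checking intuition on short sequences.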
Related articles
Inferring phylogenetic relationships avoiding forbidden rooted triplets
To construct a phylogenetic tree or phylogenetic network for describing the evolutionary history of a set of species is a well-studied problem in computational biology. One previously proposed method to infer a phylogenetic tree/network for a large set of species is by merging a collection of known smaller phylogenetic trees on overlapping sets of species so that no (or as little as possible) b...
An Investigation on characterization of cucumber mosaic virus isolated from lily green house in Damavand County, Iran
Background and Aims: Virus infections represent some of the most important diseases of lily plants because of the devastating effects caused to the crops and the absence of effective treatments. A survey for virus diseases of lilies revealed the occurrence of Cucumber mosaic virus (CMV) in plants growing in Tehran province, Iran. Materials and Methods: During 2013, 50 lily samples with virus-...
Towards an accurate identification of mosaic genes and partial horizontal gene transfers
Many bacteria and viruses adapt to varying environmental conditions through the acquisition of mosaic genes. A mosaic gene is composed of alternating sequence polymorphisms either belonging to the host original allele or derived from the integrated donor DNA. Often, the integrated sequence contains a selectable genetic marker (e.g. marker allowing for antibiotic resistance). An effective identi...
Methodology for Inferring Moral Priorities According to the Narrations of "Afal Tafzil"
Considering the different levels of moral values in Islam, in order to know the most important values and also to eliminate the contradiction, it is necessary to deduce from the texts of verses and hadiths. One of the most important aspects in these texts is the "structure of Tafzil". Some narrations of this structure indicate the priority of one or more values and others indicate a rule in det...
Bayesian approach to inference of population structure
Methods of inferring population structure, and their applications in identifying disease models and in predicting the physical and mental condition of human beings, have become increasingly important. In this article, first, the motivation for and significance of studying the problem of population structure is explained. In the next section, the applications of inference of p...
Journal title:
- Genetics
Volume 176, Issue 2
Publication date: 2007